Locally Hierarchical Auto-Regressive Modeling for Image Generation (Supplementary Document)

Neural Information Processing Systems

At the first epoch, the learning rate is warmed up gradually from lr_init = 1e-5 to lr_peak. Figures A and B demonstrate the performance of the baseline and rejection sampling under varying hyperparameters such as top-k, softmax temperature, and acceptance ratio. For baseline sampling on ImageNet, the hyperparameter setting with k = 2048 and temperature t = 0.95 achieves the best FID in the small and medium models and the second-best in the large model.

Figure C: Examples of reconstructed images using HQ-VAE with the learnable down- and upsampling layers.

B.3 Prediction Head Transformer (PHT)

We propose locally hierarchical decoding in PHT, in contrast to the standard sequential approach, by assuming conditional independence among bottom codes given a top code. We use pixel-shuffle and -unshuffle for the resizing operations, as illustrated in (a), while recursively quantizing hierarchical feature maps to acquire three-level codes: top, middle, and bottom.
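The top-k and softmax-temperature hyperparameters swept in Figures A and B can be sketched as follows. This is a minimal NumPy illustration of top-k sampling with temperature scaling, not the paper's implementation; the function and argument names are our own:

```python
import numpy as np

def topk_temperature_sample(logits, k=2048, t=0.95, rng=None):
    """Sample a code index from `logits` after temperature scaling
    and top-k filtering (illustrative sketch; names are our own)."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / t  # temperature scaling
    # Keep only the k largest logits; mask the rest to -inf.
    if k < logits.size:
        kth = np.partition(logits, -k)[-k]
        logits = np.where(logits >= kth, logits, -np.inf)
    # Softmax over the surviving logits (masked entries get probability 0).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(logits.size, p=probs)
```

Lowering t below 1 sharpens the distribution while smaller k prunes the tail of the code vocabulary; the sweep in Figures A and B explores this trade-off against FID.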



Locally Hierarchical Auto-Regressive Modeling for Image Generation

Neural Information Processing Systems

We propose a locally hierarchical auto-regressive model with multiple resolutions of discrete codes. In the first stage of our algorithm, we represent an image with a pyramid of codes using a Hierarchically Quantized Variational AutoEncoder (HQ-VAE), which disentangles the information contained in the multi-level codes. In the case of two-level codes, for example, we create two separate pathways: top codes carry the high-level coarse structure of input images, while a residual connection for bottom codes compensates for the missing fine details. An appropriate selection of resizing operations for the code embedding maps enables top codes to capture maximal information within images, and the first-stage algorithm achieves better performance on both vector quantization and image generation. The second stage adopts a Hierarchically Quantized Transformer (HQ-Transformer) to process a sequence of local pyramids, each of which consists of a single top code and its corresponding bottom codes.
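The local pyramid described above, in which each top code is grouped with its corresponding bottom codes, can be illustrated with a space-to-depth (pixel-unshuffle) rearrangement. Below is a minimal NumPy sketch under our own naming and layout, not the authors' code:

```python
import numpy as np

def local_pyramids(top_codes, bottom_codes, r=2):
    """Pair each top code with its r x r block of bottom codes.

    top_codes:    (H, W) integer code map
    bottom_codes: (H*r, W*r) integer code map
    Returns a (H*W, 1 + r*r) array: each row is one local pyramid,
    [top, b_00, b_01, ..., b_rr] in raster order (our own layout).
    """
    H, W = top_codes.shape
    # Pixel-unshuffle: gather each r x r spatial block of bottom codes
    # into the last dimension, aligned with its parent top code.
    blocks = (bottom_codes.reshape(H, r, W, r)
                          .transpose(0, 2, 1, 3)
                          .reshape(H, W, r * r))
    tops = top_codes.reshape(H, W, 1)
    return np.concatenate([tops, blocks], axis=-1).reshape(H * W, 1 + r * r)
```

Processing the code maps as a sequence of such local pyramids, rather than one long raster scan over every level, is what lets the second-stage transformer decode the bottom codes of each pyramid conditionally independently given its top code.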



